gh-148276: Optimize object creation and method calls in the JIT by resolving __init__ at trace optimization time#148277
eendebakpt wants to merge 4 commits into python:main
…nt type guards
- _CHECK_AND_ALLOCATE_OBJECT: resolve __init__ from type's _spec_cache so the optimizer can follow into __init__ bodies
- _GUARD_TYPE_VERSION_LOCKED: add optimizer handler to track type version and NOP redundant guards on the same object
- Add test_guard_type_version_locked_removed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fidget-Spinner left a comment
LGTM, just two comments on wording.
Lib/test/test_capi/test_opt.py
Outdated
    enabling the optimizer to trace into the init frame and eliminate
    redundant function version and arg count checks.
This has nothing to do with tracing into the init frame; we already do that. It's more about propagating information through the frame.
    op(_GUARD_TYPE_VERSION_LOCKED, (type_version/2, owner -- owner)) {
        assert(type_version);
        if (sym_matches_type_version(owner, type_version)) {
            ADD_OP(_NOP, 0, 0);
We should not be removing this, as we are moving towards FT (free-threading) compatibility. This uop also unlocks objects on FT builds, so it is side-effecting and we need to keep it around.
Instead, you should split _GUARD_TYPE_VERSION_LOCKED into _GUARD_TYPE_VERSION + UNLOCK. See, for example, the _LOCK_OBJECT op.
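To make the suggested split concrete, here is a toy Python sketch of the idea, not CPython code. Uops are modeled as plain tuples, `UNLOCK` is a hypothetical name taken from the comment above, and the helper functions are invented for illustration. It only shows why a split lets an optimizer drop the pure guard while the side-effecting half stays in the trace; how the real uops would handle the unlock on the deopt path is not settled in this thread.

```python
# Toy model: a uop is a (name, object, type_version) tuple.
# "UNLOCK" and both helpers below are hypothetical, for illustration only.


def split_fused_guard(trace):
    """Expand each fused guard into a pure guard plus a separate unlock."""
    out = []
    for name, obj, version in trace:
        if name == "_GUARD_TYPE_VERSION_LOCKED":
            out.append(("_GUARD_TYPE_VERSION", obj, version))
            out.append(("UNLOCK", obj, None))
        else:
            out.append((name, obj, version))
    return out


def drop_proven_guards(trace, proven):
    """Remove guards whose type version is already proven.

    `proven` maps an object name to its known type version. Only the pure
    guard is ever removed; the side-effecting UNLOCK is never touched.
    """
    return [
        uop
        for uop in trace
        if not (uop[0] == "_GUARD_TYPE_VERSION" and proven.get(uop[1]) == uop[2])
    ]


fused = [("_GUARD_TYPE_VERSION_LOCKED", "owner", 7)]
split = split_fused_guard(fused)
remaining = drop_proven_guards(split, proven={"owner": 7})
```

With the fused form, removing the guard would also remove the unlock; after the split, only the check disappears.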
I do not fully understand. The unlock only happens when the type version doesn't match. If that cannot happen, there is no need to keep the unlock part, or is there?
I'm renaming the opcode references from _GUARD_TYPE_VERSION_LOCKED to _GUARD_TYPE_VERSION but I'm not sure how to add the unlock part. I would like some help with that part.
Co-authored-by: Ken Jin <kenjin4096@gmail.com>
MazinSharaf left a comment
This only changes the opcode references from _GUARD_TYPE_VERSION_LOCKED to _GUARD_TYPE_VERSION, following the review comment on optimizer_bytecodes.c, specifically at R140. It does not fully address that comment yet, and I would like some help adding the rest of the changes, namely the unlock part.
    @@ -134,6 +134,21 @@ dummy_func(void) {
            assert(!PyJitRef_IsUnique(value));
        }

        op(_GUARD_TYPE_VERSION_LOCKED, (type_version/2, owner -- owner)) {

Suggested change:

    -    op(_GUARD_TYPE_VERSION_LOCKED, (type_version/2, owner -- owner)) {
    +    op(_GUARD_TYPE_VERSION, (type_version/2, owner -- owner)) {

This is in reference to the comment on R140
It's not the full solution, but it's going towards that direction.
I still need to add the unlock/deopt part, but I'm not quite sure how to go about it; I am still learning this code. It would be nice if someone could help me with that. Thanks!
    @@ -2721,6 +2721,11 @@ dummy_func(
        }

        op(_GUARD_TYPE_VERSION_LOCKED, (type_version/2, owner -- owner)) {

Suggested change:

    -    op(_GUARD_TYPE_VERSION_LOCKED, (type_version/2, owner -- owner)) {
    +    op(_GUARD_TYPE_VERSION, (type_version/2, owner -- owner)) {

Refer to R137's comments and R140's comments on optimizer_bytecodes.c
    @@ -2721,6 +2721,11 @@ dummy_func(
        }

        op(_GUARD_TYPE_VERSION_LOCKED, (type_version/2, owner -- owner)) {
            // Guard that type version matches expected value. Object is assumed to be

Suggested change:

    -        // Guard that type version matches expected value. Object is assumed to be

    @@ -2721,6 +2721,11 @@ dummy_func(
        }

        op(_GUARD_TYPE_VERSION_LOCKED, (type_version/2, owner -- owner)) {
            // Guard that type version matches expected value. Object is assumed to be
            // locked on entry. If version matches, lock is retained for subsequent

Suggested change:

    -        // locked on entry. If version matches, lock is retained for subsequent

    @@ -2721,6 +2721,11 @@ dummy_func(
        }

        op(_GUARD_TYPE_VERSION_LOCKED, (type_version/2, owner -- owner)) {
            // Guard that type version matches expected value. Object is assumed to be
            // locked on entry. If version matches, lock is retained for subsequent
            // operations. If mismatch, unlock and exit (deopt). This allows the JIT

Suggested change:

    -        // operations. If mismatch, unlock and exit (deopt). This allows the JIT

            // locked on entry. If version matches, lock is retained for subsequent
            // operations. If mismatch, unlock and exit (deopt). This allows the JIT
            // optimizer to eliminate this guard entirely if type version is proven,
            // in which case the lock is held for the entire trace duration.

Suggested change:

    -        // in which case the lock is held for the entire trace duration.

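The behavior described in the code comment above (locked on entry, lock retained on a match, unlock and exit on a mismatch) can be sketched in plain Python. This is only an illustration of those three sentences: `threading.Lock` stands in for whatever per-object locking the free-threaded build uses, and `Deopt` and `guard_type_version_locked` are hypothetical names, not CPython APIs.

```python
import threading


class Deopt(Exception):
    """Stand-in for leaving the optimized trace (a deopt/exit)."""


def guard_type_version_locked(lock, actual_version, expected_version):
    """Model of the commented semantics; the caller must already hold `lock`."""
    if actual_version != expected_version:
        # Mismatch: release the lock before leaving the trace.
        lock.release()
        raise Deopt()
    # Match: fall through with the lock still held for later operations.


lock = threading.Lock()
lock.acquire()  # "Object is assumed to be locked on entry."

guard_type_version_locked(lock, 42, 42)
held_after_match = lock.locked()

try:
    guard_type_version_locked(lock, 43, 42)
    deopted = False
except Deopt:
    deopted = True
held_after_mismatch = lock.locked()
```

In this model the unlock runs only on the mismatch path, which is the point raised earlier in the thread: if the version is proven, that path can never be taken.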
            Foo.attr = 0
            self.assertFalse(ex.is_valid())

        def test_guard_type_version_locked_removed(self):

Suggested change:

    -    def test_guard_type_version_locked_removed(self):
    +    def test_guard_type_version_removed(self):

Please refer to R137's and R140's comments on optimizer_bytecodes.c
            res, ex = self._run_with_optimizer(thing, TIER2_THRESHOLD)
            self.assertIsNotNone(ex)
            opnames = list(iter_opnames(ex))
            guard_locked_count = opnames.count("_GUARD_TYPE_VERSION_LOCKED")

Suggested change:

    -        guard_locked_count = opnames.count("_GUARD_TYPE_VERSION_LOCKED")
    +        guard_count = opnames.count("_GUARD_TYPE_VERSION")

        def test_guard_type_version_locked_removed(self):
            """
            Verify that redundant _GUARD_TYPE_VERSION_LOCKED guards are

Suggested change:

    -        Verify that redundant _GUARD_TYPE_VERSION_LOCKED guards are
    +        Verify that redundant _GUARD_TYPE_VERSION guards are

Optimize object creation and method calls in the JIT by resolving __init__ at trace compile time and eliminating redundant type guards. The idea was picked up when experimenting with the ideas in #144388 using Claude Code.

Changes
- _CHECK_AND_ALLOCATE_OBJECT: resolve the __init__ function to a constant via _spec_cache.init, allowing the optimizer to eliminate _CHECK_FUNCTION_VERSION and _CHECK_FUNCTION_EXACT_ARGS for the init call
- _GUARD_TYPE_VERSION_LOCKED: propagate type version info so repeated guards on the same type within a trace are NOPed

Benchmark (release JIT, x86_64) on
- Point(x, y)
- p.translate().dist()
- v.scale().add().dot()

Object creation + method chains are 1.2-1.3x faster. Simple method calls and descriptors are unchanged.
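As a rough illustration of the second change, the sketch below models "propagate type version info so repeated guards are NOPed" as a single forward pass over a trace. It is a simplified Python model, not the optimizer's code: the `Uop` class, the `target` field, `_SOME_OTHER_UOP`, and `nop_redundant_guards` are assumptions made for the example, and a real optimizer would also have to forget what it knows whenever the type could change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Uop:
    name: str          # uop name, e.g. "_GUARD_TYPE_VERSION"
    target: str = ""   # symbolic object the uop acts on (hypothetical field)
    operand: int = 0   # for guards: the expected type version


def nop_redundant_guards(trace):
    """Replace a guard with _NOP if the same object was already guarded
    against the same type version earlier in the trace."""
    known = {}  # symbolic object -> type version proven so far
    out = []
    for uop in trace:
        if uop.name == "_GUARD_TYPE_VERSION":
            if known.get(uop.target) == uop.operand:
                out.append(Uop("_NOP"))
                continue
            known[uop.target] = uop.operand
        out.append(uop)
    return out


trace = [
    Uop("_GUARD_TYPE_VERSION", "p", 1234),
    Uop("_SOME_OTHER_UOP", "p"),            # placeholder for real work
    Uop("_GUARD_TYPE_VERSION", "p", 1234),  # same object, same version
    Uop("_SOME_OTHER_UOP", "p"),
    Uop("_GUARD_TYPE_VERSION", "q", 1234),  # different object: must stay
]
opnames = [uop.name for uop in nop_redundant_guards(trace)]
guard_count = opnames.count("_GUARD_TYPE_VERSION")
nop_count = opnames.count("_NOP")
```

Counting opnames this way mirrors how the test in this PR checks the optimized executor for leftover guards.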
…__init__ at trace compile time #148276